TIMEOR accepts two input types: (1) raw .fastq files and an SraRunTable (e.g. here), or (2) an RNA-seq time-series read count matrix (e.g. here) and a metadata file (e.g. here).
TIMEOR is available online at https://timeor.brown.edu.
Import an SraRunTable from GEO*. TIMEOR will then process the raw data by retrieving the .fastq files and performing quality control, alignment, and read count matrix creation. Read this section below.
Import a metadata file** and count matrix***. This skips raw data retrieval, quality control, alignment, and read count matrix creation, and proceeds straight to normalization and correction. Read this section below.
Then simply follow the prompts. Fill out the grey boxes to begin interacting with each stage and tab.
* SraRunTable from GEO: follow the instructions in TIMEOR’s first tab (“Process Raw Data”).
** The metadata file requires at least these columns: ID, condition, time, batch.
- ID: a unique identifier (ID) for the user (e.g. case_1min_rep1)
- condition: one-word description (e.g. case, control)
- time: numerical values (e.g. 0, 20, 40)
- batch: string description of batch (e.g. b1, b2, b3)
*** Count matrix: rows should be unique gene identifiers (e.g. FlyBase, Ensembl, or Entrez IDs) and columns should be the IDs from the metadata file.
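Before uploading, it can help to check that the metadata file and count matrix agree with the requirements above. The following is a minimal sketch, not part of TIMEOR. It assumes both files are comma-separated and that the first column of the count matrix holds the gene identifiers. The function name `check_inputs`, the sample IDs, and the gene identifiers are hypothetical.

```python
import csv
import io

# Columns the metadata file must contain, as listed above.
REQUIRED_COLUMNS = ["ID", "condition", "time", "batch"]


def check_inputs(metadata_text, counts_text):
    """Return a list of problems; an empty list means the inputs look consistent.

    Assumes comma-separated text (an assumption; adjust the delimiter if needed).
    """
    problems = []

    meta_reader = csv.DictReader(io.StringIO(metadata_text))
    fields = meta_reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in fields]
    if missing:
        problems.append(f"metadata is missing columns: {missing}")
        return problems

    rows = list(meta_reader)
    ids = [row["ID"] for row in rows]
    if len(set(ids)) != len(ids):
        problems.append("metadata IDs are not unique")
    for row in rows:
        try:
            float(row["time"])
        except ValueError:
            problems.append(f"time is not numeric for {row['ID']}")

    count_rows = list(csv.reader(io.StringIO(counts_text)))
    sample_columns = count_rows[0][1:]  # first column is assumed to hold gene IDs
    if set(sample_columns) != set(ids):
        problems.append("count matrix columns do not match metadata IDs")
    genes = [row[0] for row in count_rows[1:]]
    if len(set(genes)) != len(genes):
        problems.append("gene identifiers are not unique")

    return problems


# Hypothetical example data in the layout described above.
METADATA = "\n".join([
    "ID,condition,time,batch",
    "case_0min_rep1,case,0,b1",
    "case_20min_rep1,case,20,b1",
    "case_40min_rep1,case,40,b1",
])
COUNTS = "\n".join([
    "gene,case_0min_rep1,case_20min_rep1,case_40min_rep1",
    "FBgn0000001,10,25,40",
    "FBgn0000002,0,3,7",
])

print(check_inputs(METADATA, COUNTS))
```

An empty list means the required columns are present, IDs and gene identifiers are unique, times are numeric, and the count matrix columns match the metadata IDs.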
This tutorial uses a subset of the real data used in the TIMEOR publication to take the user through TIMEOR’s “Process Raw Data” tab. You will first see this pop-up; please read it. There are 4 steps.
This tutorial uses simulated data and takes the user through TIMEOR’s full functionality, beginning from a read count matrix (genes x sample/time). NOTE: figures with two panels show the same page, just split. There are 20 steps.
The user can begin this tutorial before or after following “Run TIMEOR from Raw Data: Starting from .fastq Time-Series RNA-seq”.
Proceed to Primary Analysis and click “Run”.
At the bottom right, a pop-up will prompt you to click “Render Venn Diagram” (in the top right) to compare differential expression results between three methods (ImpulseDE2, Next maSigPro, and DESeq2) and choose which method’s results to proceed with. You will then see a pop-up saying that you have completed Primary Analysis. Feel free to move on with TIMEOR’s default parameters, or explore Primary Analysis options (see next step).
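To make the Venn diagram comparison concrete, the regions of a three-way Venn diagram can be computed with plain set operations. This is only an illustrative sketch of the idea, not TIMEOR’s implementation. The function name `venn_regions` and the gene names are hypothetical.

```python
def venn_regions(impulsede2, masigpro, deseq2):
    """Split three differentially expressed gene lists into the 7 Venn regions."""
    a, b, c = set(impulsede2), set(masigpro), set(deseq2)
    return {
        "ImpulseDE2 only": a - b - c,
        "Next maSigPro only": b - a - c,
        "DESeq2 only": c - a - b,
        "ImpulseDE2 & Next maSigPro": (a & b) - c,
        "ImpulseDE2 & DESeq2": (a & c) - b,
        "Next maSigPro & DESeq2": (b & c) - a,
        "all three": a & b & c,
    }


# Hypothetical gene lists, one per method.
regions = venn_regions(
    impulsede2=["g1", "g2", "g3", "g4"],
    masigpro=["g2", "g3", "g5"],
    deseq2=["g3", "g4", "g5", "g6"],
)
for name, genes in regions.items():
    print(f"{name}: {sorted(genes)}")
```

Genes in the “all three” region are supported by every method, which is one reasonable consideration when choosing which method’s results to carry forward.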
As the pop-up indicates, proceed to the Secondary Analysis tab in the side-bar.
Clusters are labeled in ascending order from 1 for top-most cluster. Under Gene Expression Trajectory Clusters choose cluster 1, 2, or 3 in the dropdown. On the right under “Chosen Cluster Gene Set” you will see the genes in that cluster. Genes are the same color as the gene trajectory cluster to which they belong.
Once you have chosen which gene set to test for enrichment, click the “Analyze” toggle to “ON”.
Wait to view any enriched gene ontology (GO) terms (Molecular Function, Biological Process, or Cellular Component), pathway, network, and/or motif analysis. NOTE: you may download the interactive motif results for viewing.
Toggle the “Analyze” button to “OFF” to choose another gene set, and repeat steps 9-12.
In that same table on the far right you will see ENCODE IDs indicating published ChIP-seq data for the predicted transcription factors. For this tutorial, you can see an example provided with “pho”. You may also either download these read-depth normalized .bigWig files here (ENCFF467OWR, ENCFF609FCZ, ENCFF346CDA) or follow the prompts (step 16) in the grey box under “Upload .bigWig Files”. Any .bigWig files from protein-DNA data are accepted.
If you are interested, click on the “+” under “See each method’s predicted transcription factors:” to see the ranked lists of transcription factors and motifs by method. Blanks indicate an enriched motif is not assigned to a transcription factor region (to see motifs click ‘Download interactive cluster motif result’). Search for a method (e.g. transfac in blue box), enrichment score, etc. Row names are top 1 - 4 transcription factors.
The original temporal RNA-seq data analyzed in our paper come from Zirin et al. (2019). In this tutorial SRR8843750 and SRR8843738 are analyzed to demonstrate the “Process Raw Data” tab, in which raw RNA-seq data are retrieved, quality checked, aligned (with HISAT2 and Bowtie2), and converted to a read count matrix. The real data subset folder (which TIMEOR automatically generates) can be downloaded here.
The original simulated data folder can be downloaded here.
To get the top 4 TFs, a 25% consensus threshold was used, with a normalized enrichment score threshold of 3.
Command used: Rscript get_top_tfs.r /PATH/TO/simulated_results/ dme 3 4 25 /PATH/TO/TIMEOR/
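The positional arguments in the command above are easier to read with names attached. The sketch below assembles the same command from named parameters. The mapping of positions to meanings (organism, normalized enrichment score threshold, number of top TFs, consensus percentage) is inferred by matching the values 3, 4, and 25 to the sentence above and is an assumption, not taken from the script’s documentation. The function and parameter names are hypothetical.

```python
import shlex


def build_top_tfs_command(results_dir, organism, nes_threshold,
                          top_n, consensus_percent, timeor_dir):
    """Assemble the get_top_tfs.r call as an argument list (argument order assumed)."""
    return [
        "Rscript", "get_top_tfs.r",
        results_dir,             # folder with the simulated results
        organism,                # e.g. "dme"
        str(nes_threshold),      # assumed: normalized enrichment score threshold
        str(top_n),              # assumed: number of top TFs to report
        str(consensus_percent),  # assumed: consensus threshold in percent
        timeor_dir,              # path to the TIMEOR directory
    ]


command = build_top_tfs_command(
    "/PATH/TO/simulated_results/", "dme", 3, 4, 25, "/PATH/TO/TIMEOR/"
)
print(shlex.join(command))
```

The argument list can be passed to `subprocess.run` once the placeholder paths are replaced with real ones.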
The following bigWig files were collected:
ENCFF467OWR (read-depth normalized signal between both replicates) within dataset ENCSR240ADR for Stat92E
ENCFF609FCZ (read-depth normalized signal between both replicates) within dataset ENCSR681YMA for pho
ENCFF346CDA (read-depth normalized signal between all three replicates) within dataset ENCSR776AVR for CG7786
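For scripted use, the three files above can be kept in a small lookup table. The accessions, datasets, factors, and replicate counts below are taken from the list above. The download URL pattern is an assumption about how ENCODE serves files and should be verified on each file’s ENCODE page. The helper names are hypothetical.

```python
# Accessions, datasets, and factors exactly as listed above.
BIGWIG_FILES = {
    "ENCFF467OWR": {"dataset": "ENCSR240ADR", "factor": "Stat92E", "replicates": 2},
    "ENCFF609FCZ": {"dataset": "ENCSR681YMA", "factor": "pho", "replicates": 2},
    "ENCFF346CDA": {"dataset": "ENCSR776AVR", "factor": "CG7786", "replicates": 3},
}


def download_url(accession):
    """Build a download URL (assumed ENCODE pattern; verify before relying on it)."""
    return (
        f"https://www.encodeproject.org/files/{accession}"
        f"/@@download/{accession}.bigWig"
    )


def accession_for(factor):
    """Return the bigWig accession collected for a given transcription factor."""
    for accession, info in BIGWIG_FILES.items():
        if info["factor"] == factor:
            return accession
    raise KeyError(factor)


pho_accession = accession_for("pho")
print(pho_accession, download_url(pho_accession))
```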
The results presented in TIMEOR’s publication can be downloaded in TIMEOR’s automatically generated folders here.